Search | VHL Regional Portal

1.

Robust Differential Abundance Analysis of Microbiome Sequencing Data.

Li, Guanxun; Yang, Lu; Chen, Jun; Zhang, Xianyang.

Genes (Basel); 14(11), 2023 Oct 26.

Article in English | MEDLINE | ID: mdl-38002943

ABSTRACT

It is well known that microbiome data are riddled with outliers and have heavy-tailed distributions, but the impact of outliers and heavy-tailedness has yet to be examined systematically. This paper investigates the impact of outliers and heavy-tailedness on differential abundance analysis (DAA) using the linear models for differential abundance analysis (LinDA) method and proposes effective strategies to mitigate their influence. The presence of outliers and heavy-tailedness can significantly decrease the power of LinDA. We investigate various techniques to address outliers and heavy-tailedness, including generalizing LinDA into a more flexible framework that allows for the use of robust regression, and winsorizing the data before applying LinDA. Our extensive numerical experiments and real-data analyses demonstrate that robust Huber regression has the best overall performance in addressing outliers and heavy-tailedness.

Subjects

Microbiota; Microbiota/genetics
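
The two mitigation strategies named in the abstract of entry 1, winsorizing and Huber regression, can be illustrated with a minimal, generic sketch. This is not the LinDA implementation: the function names `winsorize_upper` and `huber_line`, the 0.97 quantile, the tuning constant 1.345, and the toy data are all assumptions chosen for illustration, and the Huber fit is a plain iteratively reweighted least squares for a single covariate.

```python
import math
from statistics import median


def winsorize_upper(values, quantile=0.97):
    """Cap values above the empirical `quantile` (nearest-rank definition)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(quantile * len(ordered)))
    cap = ordered[rank - 1]
    return [min(v, cap) for v in values]


def huber_line(x, y, delta=1.345, iterations=50):
    """Fit y = a + b*x by iteratively reweighted least squares with Huber weights.

    Assumes x contains at least two distinct values.
    """
    weights = [1.0] * len(x)
    a = b = 0.0
    for _ in range(iterations):
        total = sum(weights)
        mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
        mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
        sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
        sxy = sum(w * (xi - mean_x) * (yi - mean_y)
                  for w, xi, yi in zip(weights, x, y))
        b = sxy / sxx
        a = mean_y - b * mean_x
        residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
        # Robust scale: median absolute residual, rescaled for Gaussian data.
        scale = median([abs(r) for r in residuals]) / 0.6745
        if scale == 0.0:
            break  # at least half of the points are fit exactly
        cutoff = delta * scale
        # Huber weights: 1 inside the cutoff, shrinking as cutoff/|r| outside it.
        weights = [1.0 if abs(r) <= cutoff else cutoff / abs(r)
                   for r in residuals]
    return a, b


# Toy data: an exact line y = 2 + 3x with one gross outlier.
x = list(range(10))
y = [2 + 3 * xi for xi in x]
y[9] += 100
ols = huber_line(x, y, iterations=1)  # first pass uses unit weights (ordinary least squares)
robust = huber_line(x, y)
capped = winsorize_upper([1, 2, 3, 4, 5, 6, 7, 8, 9, 1000], quantile=0.9)
```

In this toy setting the robust slope ends up closer to the true value of 3 than the least-squares slope, and winsorizing replaces the extreme value with the cap at the chosen quantile.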

2.

Interpretable modeling of time-resolved single-cell gene-protein expression with CrossmodalNet.

Yang, Yongjian; Lin, Yu-Te; Li, Guanxun; Zhong, Yan; Xu, Qian; Cai, James J.

Brief Bioinform; 24(6), 2023 Sep 22.

Article in English | MEDLINE | ID: mdl-37798250

ABSTRACT

Cell-surface proteins play a critical role in cell function and are primary targets for therapeutics. CITE-seq is a single-cell technique that enables simultaneous measurement of gene and surface protein expression. It is powerful but costly and technically challenging. Computational methods have been developed to predict surface protein expression using gene expression information, such as that from single-cell RNA sequencing (scRNA-seq) data. Existing methods, however, are computationally demanding and lack the interpretability to reveal underlying biological processes. We propose CrossmodalNet, an interpretable machine learning model, to predict surface protein expression from scRNA-seq data. Our model, with a customized adaptive loss, accurately predicts surface protein abundances. When samples from multiple time points are given, our model encodes temporal information into an easy-to-interpret time embedding to make predictions in a time-point-specific manner, and is able to uncover noise-free causal gene-protein relationships. Using three publicly available time-resolved CITE-seq data sets, we validate the performance of our model by comparing it with benchmarking methods and evaluate its interpretability. Together, we show that our method accurately and interpretably profiles surface protein expression using scRNA-seq data, thereby expanding the capacity of CITE-seq experiments for investigating molecular mechanisms involving surface proteins.

Subjects

Algorithms; Gene Expression Profiling; Gene Expression Profiling/methods; Sequence Analysis, RNA/methods; Single-Cell Analysis/methods; Membrane Proteins

3.

Gene knockout inference with variational graph autoencoder learning single-cell gene regulatory networks.

Yang, Yongjian; Li, Guanxun; Zhong, Yan; Xu, Qian; Chen, Bo-Jia; Lin, Yu-Te; Chapkin, Robert S; Cai, James J.

Nucleic Acids Res; 51(13): 6578-6592, 2023 Jul 21.

Article in English | MEDLINE | ID: mdl-37246643

ABSTRACT

In this paper, we introduce Gene Knockout Inference (GenKI), a virtual knockout (KO) tool for gene function prediction using single-cell RNA sequencing (scRNA-seq) data when only wild-type (WT) samples are available and no KO samples exist. Without using any information from real KO samples, GenKI is designed to capture shifting patterns in gene regulation caused by the KO perturbation in an unsupervised manner and provide a robust and scalable framework for gene function studies. To achieve this goal, GenKI adapts a variational graph autoencoder (VGAE) model to learn latent representations of genes and interactions between genes from the input WT scRNA-seq data and a derived single-cell gene regulatory network (scGRN). The virtual KO data are then generated by computationally removing all edges of the KO gene (the gene to be knocked out for functional study) from the scGRN. The differences between WT and virtual KO data are discerned by using their corresponding latent parameters derived from the trained VGAE model. Our simulations show that GenKI accurately approximates the perturbation profiles upon gene KO and outperforms the state of the art under a series of evaluation conditions. Using publicly available scRNA-seq data sets, we demonstrate that GenKI recapitulates discoveries of real-animal KO experiments and accurately predicts cell type-specific functions of KO genes. Thus, GenKI provides an in silico alternative to KO experiments that may partially replace the need for genetically modified animals or other genetically perturbed systems.

Subjects

Gene Regulatory Networks; Single-Cell Analysis; Animals; Gene Knockout Techniques; Gene Expression Regulation; Sequence Analysis, RNA; Gene Expression Profiling
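
The edge-removal step described in the abstract of entry 3 (deleting all edges of the KO gene from the scGRN) can be sketched on a toy network. This is only an illustration of that single step, not GenKI itself: the dict-of-dicts representation, the function name `virtual_knockout`, and the gene names and weights are hypothetical, and the VGAE training and comparison of latent parameters are not shown.

```python
def virtual_knockout(network, ko_gene):
    """Return a copy of a weighted gene network without any edge touching `ko_gene`.

    `network` maps each regulator gene to a dict of {target gene: edge weight}.
    The KO gene stays in the network as an isolated node.
    """
    return {
        source: {
            target: weight
            for target, weight in targets.items()
            if source != ko_gene and target != ko_gene
        }
        for source, targets in network.items()
    }


# Hypothetical three-gene wild-type network.
wild_type = {
    "GeneA": {"GeneB": 0.8, "GeneC": -0.3},
    "GeneB": {"GeneC": 0.5},
    "GeneC": {"GeneA": 0.2},
}
knocked_out = virtual_knockout(wild_type, "GeneB")
```

The function returns a new network and leaves the wild-type one untouched, so both can be passed to a downstream comparison.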

4.

scTenifoldXct: A semi-supervised method for predicting cell-cell interactions and mapping cellular communication graphs.

Yang, Yongjian; Li, Guanxun; Zhong, Yan; Xu, Qian; Lin, Yu-Te; Roman-Vicharra, Cristhian; Chapkin, Robert S; Cai, James J.

Cell Syst; 14(4): 302-311.e4, 2023 Apr 19.

Article in English | MEDLINE | ID: mdl-36787742

ABSTRACT

We present scTenifoldXct, a semi-supervised computational tool for detecting ligand-receptor (LR)-mediated cell-cell interactions and mapping cellular communication graphs. Our method is based on manifold alignment, using LR pairs as inter-data correspondences to embed ligand and receptor genes expressed in interacting cells into a unified latent space. Neural networks are employed to minimize the distance between corresponding genes while preserving the structure of gene regression networks. We apply scTenifoldXct to real datasets for testing and demonstrate that our method detects interactions with high consistency compared with other methods. More importantly, scTenifoldXct uncovers weak but biologically relevant interactions overlooked by other methods. We also demonstrate how scTenifoldXct can be used to compare different samples, such as healthy vs. diseased and wild type vs. knockout, to identify differential interactions, thereby revealing functional implications associated with changes in cellular communication status.

Subjects

Cell Communication; Neural Networks, Computer; Ligands; Communication

5.

scTenifoldKnk: An efficient virtual knockout tool for gene function predictions via single-cell gene regulatory network perturbation.

Osorio, Daniel; Zhong, Yan; Li, Guanxun; Xu, Qian; Yang, Yongjian; Tian, Yanan; Chapkin, Robert S; Huang, Jianhua Z; Cai, James J.

Patterns (N Y); 3(3): 100434, 2022 Mar 11.

Article in English | MEDLINE | ID: mdl-35510185

ABSTRACT

Gene knockout (KO) experiments are a proven, powerful approach for studying gene function. However, systematic KO experiments targeting a large number of genes are usually infeasible because experimental and animal resources are limited. Here, we present scTenifoldKnk, an efficient virtual KO tool that enables systematic KO investigation of gene function using data from single-cell RNA sequencing (scRNA-seq). In scTenifoldKnk analysis, a gene regulatory network (GRN) is first constructed from scRNA-seq data of wild-type samples, and a target gene is then virtually deleted from the constructed GRN. Manifold alignment is used to align the resulting reduced GRN to the original GRN to identify differentially regulated genes, which are used to infer target gene functions in the analyzed cells. We demonstrate that the scTenifoldKnk-based virtual KO analysis recapitulates the main findings of real-animal KO experiments and recovers the expected functions of genes in relevant cell types.

6.

scInTime: A Computational Method Leveraging Single-Cell Trajectory and Gene Regulatory Networks to Identify Master Regulators of Cellular Differentiation.

Xu, Qian; Li, Guanxun; Osorio, Daniel; Zhong, Yan; Yang, Yongjian; Lin, Yu-Te; Zhang, Xiuren; Cai, James J.

Genes (Basel); 13(2), 2022 Feb 18.

Article in English | MEDLINE | ID: mdl-35205415

ABSTRACT

Trajectory inference (TI), or pseudotime analysis, has dramatically extended the analytical framework of single-cell RNA-seq data, allowing regulatory genes contributing to cell differentiation and those involved in various dynamic cellular processes to be identified. However, most TI analysis procedures deal with individual genes independently while overlooking the regulatory relations between genes. Integrating information from gene regulatory networks (GRNs) at different pseudotime points may lead to more interpretable TI results. To this end, we introduce scInTime, an unsupervised machine learning framework coupling inferred trajectory with single-cell GRNs (scGRNs) to identify master regulatory genes. We validated the performance of our method by analyzing multiple scRNA-seq data sets. In each case, the top-ranking genes predicted by scInTime were functionally relevant to the corresponding signaling pathways, in line with the results of available functional studies. Overall, the results demonstrated that scInTime is a powerful tool to exploit pseudotime-series scGRNs, allowing for a clear interpretation of TI results toward more significant biological insights.

Subjects

Computational Biology; Gene Regulatory Networks; Cell Differentiation/genetics; Computational Biology/methods

7.

scTenifoldNet: A Machine Learning Workflow for Constructing and Comparing Transcriptome-wide Gene Regulatory Networks from Single-Cell Data.

Osorio, Daniel; Zhong, Yan; Li, Guanxun; Huang, Jianhua Z; Cai, James J.

Patterns (N Y); 1(9): 100139, 2020 Dec 11.

Article in English | MEDLINE | ID: mdl-33336197

ABSTRACT

We present scTenifoldNet, a machine learning workflow built upon principal-component regression, low-rank tensor approximation, and manifold alignment, for constructing and comparing single-cell gene regulatory networks (scGRNs) using data from single-cell RNA sequencing. scTenifoldNet reveals regulatory changes in gene expression between samples by comparing the constructed scGRNs. With real data, scTenifoldNet identifies specific gene expression programs associated with different biological processes, providing critical insights into the underlying mechanism of regulatory networks governing cellular transcriptional activities.

8.

Single-Cell Expression Variability Implies Cell Function.

Osorio, Daniel; Yu, Xue; Zhong, Yan; Li, Guanxun; Yu, Peng; Serpedin, Erchin; Huang, Jianhua Z; Cai, James J.

Cells; 9(1), 2019 Dec 19.

Article in English | MEDLINE | ID: mdl-31861624

ABSTRACT

As single-cell RNA sequencing (scRNA-seq) data become widely available, cell-to-cell variability in gene expression, or single-cell expression variability (scEV), has been increasingly appreciated. However, it remains unclear whether this variability is functionally important and, if so, what its implications are for multicellular organisms. Here, we analyzed multiple scRNA-seq data sets from lymphoblastoid cell lines (LCLs), lung airway epithelial cells (LAECs), and dermal fibroblasts (DFs) and, for each cell type, selected a group of homogeneous cells with highly similar expression profiles. We estimated the scEV levels for genes after correcting the mean-variance dependency in the data and identified 465, 466, and 364 highly variable genes (HVGs) in LCLs, LAECs, and DFs, respectively. The functions of these HVGs were enriched in biological processes directly relevant to the function of the cell type from which the scRNA-seq data were generated; e.g., cytokine signaling pathways were enriched in HVGs identified in LCLs, collagen formation in LAECs, and keratinization in DFs. We repeated the same analysis with scRNA-seq data from induced pluripotent stem cells (iPSCs) and identified only 79 HVGs with no statistically significant enriched functions; the overall scEV in iPSCs was of negligible magnitude. Our results support the "variation is function" hypothesis, arguing that scEV is required for cell type-specific, higher-level system function. Thus, quantifying and characterizing scEV are important for our understanding of normal and pathological cellular processes.

Subjects

Gene Expression Profiling/methods; Gene Regulatory Networks; Single-Cell Analysis/methods; Algorithms; Cell Line; Gene Expression Regulation; Humans; Organ Specificity; Sequence Analysis, RNA/methods
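
The idea in the abstract of entry 8 of ranking genes by variability after correcting the mean-variance dependency can be sketched as follows. The abstract does not specify the correction, so this sketch assumes one common approach: regress the log squared coefficient of variation on the log mean across genes and rank genes by the residual. The function name `rank_by_residual_variability` and the toy expression values are hypothetical.

```python
import math
from statistics import mean, variance


def rank_by_residual_variability(expression):
    """Rank genes by variability left after removing the mean-variance trend.

    `expression` maps gene name -> list of expression values across cells.
    Genes with zero mean or zero variance are skipped.
    """
    stats = {}
    for gene, counts in expression.items():
        m = mean(counts)
        v = variance(counts)
        if m > 0 and v > 0:
            # Store (log mean, log squared coefficient of variation).
            stats[gene] = (math.log(m), math.log(v / m ** 2))
    xs = [s[0] for s in stats.values()]
    ys = [s[1] for s in stats.values()]
    mx, my = mean(xs), mean(ys)
    # Least-squares trend of log CV^2 against log mean (assumes means differ).
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    residual = {g: y - (intercept + slope * x) for g, (x, y) in stats.items()}
    return sorted(residual, key=residual.get, reverse=True)


# Toy data: g1-g3 share the same relative spread; g4 is far more variable.
toy = {
    "g1": [1, 2, 3],
    "g2": [10, 20, 30],
    "g3": [100, 200, 300],
    "g4": [5, 50, 5],
}
ranked = rank_by_residual_variability(toy)
```

Taking the top of `ranked` (or thresholding the residuals) would give a candidate HVG list; the cutoff itself is a separate choice that this sketch does not make.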
